Example: In-training validation
This example shows how to keep track of the model's performance during training.
Import the breast cancer dataset from sklearn.datasets. This is a small, easy-to-train dataset whose goal is to predict whether a patient has breast cancer.
Load the data
In [1]:
# Import packages
from sklearn.datasets import load_breast_cancer
from atom import ATOMClassifier
In [2]:
# Load the data
X, y = load_breast_cancer(return_X_y=True)
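To get a feel for the data before training, the following minimal sketch inspects its size and class balance using scikit-learn only. It is an illustration and not part of the original notebook; the variable name `data` is an assumption. Note that in scikit-learn's encoding, class 0 is malignant and class 1 is benign.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

# Load the full Bunch object to access the class names as well
data = load_breast_cancer()
X, y = data.data, data.target

# 569 samples with 30 numeric features each
print(X.shape)  # (569, 30)

# Number of samples per class (0 = malignant, 1 = benign)
for label, count in zip(data.target_names, np.bincount(y)):
    print(f"{label}: {count}")
```

The 30 features plus the target column account for the shape (569, 31) that atom reports after initialization.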
Run the pipeline
In [3]:
# Initialize atom
atom = ATOMClassifier(X, y, verbose=2, random_state=1)
<< ================== ATOM ================== >>

Configuration ==================== >>
Algorithm task: Binary classification.

Dataset stats ==================== >>
Shape: (569, 31)
Train set size: 456
Test set size: 113
-------------------------------------
Memory: 141.24 kB
Scaled: False
Outlier values: 167 (1.2%)
In [4]:
# Not all models support in-training validation
# You can check which ones do using the available_models method
df = atom.available_models()[["acronym", "model", "has_validation"]]
df[df["has_validation"]]
Out[4]:
| | acronym | model | has_validation |
|---|---|---|---|
| 3 | CatB | CatBoost | True |
| 15 | LGB | LightGBM | True |
| 19 | MLP | MultiLayerPerceptron | True |
| 21 | PA | PassiveAggressive | True |
| 22 | Perc | Perceptron | True |
| 27 | SGD | StochasticGradientDescent | True |
| 29 | XGB | XGBoost | True |
In [5]:
# Run the models normally
atom.run(models=["MLP", "LGB"], metric="auc")
Training ========================= >>
Models: MLP, LGB
Metric: auc

Results for MultiLayerPerceptron:
Fit ---------------------------------------------
Train evaluation --> auc: 0.9997
Test evaluation --> auc: 0.9936
Time elapsed: 1.821s
-------------------------------------------------
Time: 1.821s

Results for LightGBM:
Fit ---------------------------------------------
Train evaluation --> auc: 1.0
Test evaluation --> auc: 0.9775
Time elapsed: 0.352s
-------------------------------------------------
Time: 0.352s

Final results ==================== >>
Total time: 2.236s
-------------------------------------
MultiLayerPerceptron --> auc: 0.9936 !
LightGBM --> auc: 0.9775
Analyze the results
In [6]:
atom.plot_evals(title="In-training validation scores")
In [7]:
# Plot the validation scores on both the train and test sets
atom.lgb.plot_evals(dataset="train+test", title="LightGBM's in-training validation")
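To make concrete what an in-training validation curve records, here is a minimal sketch that tracks the AUC on the train and test sets after every epoch. It uses scikit-learn only and is not ATOM's implementation: the model choice (SGDClassifier, one of the estimators listed as supporting validation), the manual `partial_fit` loop, the scaling step, and the names `evals` and `n_epochs` are all assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load and split the data (the split parameters are illustrative)
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=1
)

# Linear models trained with SGD are sensitive to feature scale
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

model = SGDClassifier(random_state=1)
classes = np.unique(y_train)
n_epochs = 20

# Hypothetical container for the per-epoch scores
evals = {"train": [], "test": []}

for epoch in range(n_epochs):
    # One pass over the training data
    model.partial_fit(X_train, y_train, classes=classes)

    # Score both sets after every pass
    evals["train"].append(roc_auc_score(y_train, model.decision_function(X_train)))
    evals["test"].append(roc_auc_score(y_test, model.decision_function(X_test)))

for epoch, (train, test) in enumerate(zip(evals["train"], evals["test"]), start=1):
    print(f"epoch {epoch:2d}  train auc: {train:.4f}  test auc: {test:.4f}")
```

Plotting the two lists against the epoch number gives the same kind of picture as the plots above: a growing gap between the train and test curves is a sign of overfitting.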